Audio-Visual Speech Recognition Using New Lip Features Extracted from Side-Face Images

Authors

  • Tomoaki Yoshinaga
  • Satoshi Tamura
  • Koji Iwano
  • Sadaoki Furui
Abstract

This paper proposes new visual features for audio-visual speech recognition using lip information extracted from side-face images. To increase the noise robustness of speech recognition, we previously proposed an audio-visual speech recognition method that uses speaker lip information extracted from side-face images taken by a small camera installed in a mobile device. That method used only lip-movement information, measured by optical-flow analysis, as the visual feature. Because lip-shape information is also important, this paper combines lip-shape information with lip-movement information to improve audio-visual speech recognition performance. The lip-shape features consist of the angle between the upper and lower lips (lip-angle) and its derivative. The effectiveness of the lip-angle features was evaluated under various SNR conditions. Compared with audio-only recognition, the proposed features improved recognition accuracy under all SNR conditions, with the largest improvement, 8.0% absolute, obtained at 5 dB SNR. Combining the lip-angle features with our previous features extracted by optical-flow analysis yielded further improvement. These visual features were confirmed to be effective even when the audio HMM used in our method was adapted to noise by maximum likelihood linear regression (MLLR).
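As a rough illustration of the lip-angle features described in the abstract, the following Python sketch computes an angle between the upper and lower lips and its time derivative for a sequence of frames. The abstract does not say how the angle is measured or how the derivative is computed, so everything here is an assumption: the three landmark points per frame (mouth corner, upper-lip point, lower-lip point), the function names, and the regression-style delta formula commonly used for speech features.

```python
import math

def lip_angle(corner, upper, lower):
    """Unsigned angle (radians) at the mouth corner between the directions
    to the upper-lip point and the lower-lip point (hypothetical landmarks)."""
    ux, uy = upper[0] - corner[0], upper[1] - corner[1]
    lx, ly = lower[0] - corner[0], lower[1] - corner[1]
    cross = ux * ly - uy * lx  # proportional to the sine of the angle
    dot = ux * lx + uy * ly    # proportional to the cosine of the angle
    return abs(math.atan2(cross, dot))

def delta(values, window=2):
    """Regression-based time derivative over +/- `window` frames.
    Frames beyond the sequence edges are padded by repeating the edge value.
    This formula is an assumption, not taken from the paper."""
    n = len(values)
    denom = 2 * sum(k * k for k in range(1, window + 1))
    out = []
    for t in range(n):
        num = 0.0
        for k in range(1, window + 1):
            ahead = values[min(t + k, n - 1)]
            behind = values[max(t - k, 0)]
            num += k * (ahead - behind)
        out.append(num / denom)
    return out

def lip_angle_features(frames):
    """Map a list of (corner, upper, lower) point triples, one per video
    frame, to a list of (lip-angle, delta lip-angle) feature pairs."""
    angles = [lip_angle(c, u, l) for (c, u, l) in frames]
    return list(zip(angles, delta(angles)))
```

In a full system these two-dimensional visual features would be combined with the audio features and, per the abstract, with the optical-flow features; that fusion step is not sketched here because the abstract gives no details about it.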

Similar resources

Audio-Visual Speech Recognition Using Lip Information Extracted from Side-Face Images

This paper proposes an audio-visual speech recognition method using lip information extracted from side-face images as an attempt to increase noise robustness in mobile environments. Our proposed method assumes that lip images can be captured using a small camera installed in a handset. Two different kinds of lip features, lip-contour geometric features and lip-motion velocity features, are use...

Audio-visual speech recognition using lip movement extracted from side-face images

This paper proposes an audio-visual speech recognition method using lip movement extracted from side-face images to attempt to increase noise-robustness in mobile environments. Although most previous bimodal speech recognition methods use frontal face (lip) images, these methods are not easy for users since they need to hold a device with a camera in front of their face when talking. Our propos...

Real-time lip-tracking for lipreading

This paper presents a new approach to lip tracking for lipreading. Instead of only tracking features on lips, we propose to track lips along with other facial features such as pupils and nostrils. In the new approach, the face is first located in an image using a stochastic skin-color model; the eyes, lip-corners and nostrils are then located and tracked inside the facial region. The new approach ...

Development of Framework for Automatic Speech Recognition

In this paper we propose an automatic speech recognition framework using agents. The framework includes both audio recognition and visual recognition. The audio and visual modalities are complementary, and combining the two can improve the accuracy of affective user models. The extracted audio features are processed by an audition agent. The visual processing agent ta...

Speaker and Speech recognition by Audio-Visual lip biometrics

This paper proposes a new robust bimodal audio-visual speech and speaker recognition system based on lip-motion and speech biometrics. To increase the robustness of speech and speaker recognition, we have proposed a method using speaker lip-motion information extracted from low-resolution video sequences (128 × 128 pixels). In this paper we investigate a biometric system for speech recognition a...

Publication date: 2004